Code positioning to reduce instruction cache misses in signal processing applications on multimedia RISC processors

Authors

  • Hans-Joachim Stolberg
  • Masao Ikekawa
  • Ichiro Kuroda
Abstract

Real-time operation of signal processing applications on multimedia RISC processors is often limited by the high instruction cache miss rates of direct-mapped caches. This paper presents a heuristic approach that reduces these miss rates by code positioning. The proposed algorithm rearranges functions in memory based on trace data so as to minimize cache line conflicts. Moreover, a new method to extract potential cache misses from trace data is introduced, which enables accurate cache behavior analysis and greatly enhances code positioning efficiency. Application of code positioning to an MPEG-1 video decoder implementation on the V830 multimedia RISC processor reduced instruction cache refill cycles by 66-98%. The proposed code positioning algorithm does not require hardware modifications; it can easily be integrated into an object linker to automate the optimization process.
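To illustrate the idea of a cache line conflict in a direct-mapped cache, here is a minimal sketch. It is not the authors' heuristic: it searches all function orderings exhaustively, which is only feasible for a handful of functions. The cache geometry, the function names and sizes, and the pair weights (standing in for how often two functions alternate in a trace) are all hypothetical values chosen for the example.

```python
from itertools import permutations

LINE_SIZE = 16   # bytes per cache line (assumed for this example)
NUM_LINES = 8    # lines in the direct-mapped cache (assumed)

def cache_lines(start, size):
    """Cache line indices occupied by code at addresses [start, start + size)."""
    first = start // LINE_SIZE
    last = (start + size - 1) // LINE_SIZE
    return {block % NUM_LINES for block in range(first, last + 1)}

def layout(order, sizes):
    """Place functions back to back in the given order; return start addresses."""
    addr, starts = 0, {}
    for name in order:
        starts[name] = addr
        addr += sizes[name]
    return starts

def conflict_cost(order, sizes, weights):
    """Sum over function pairs of (trace weight) x (number of shared cache lines)."""
    starts = layout(order, sizes)
    lines = {f: cache_lines(starts[f], sizes[f]) for f in order}
    return sum(w * len(lines[a] & lines[b]) for (a, b), w in weights.items())

def best_order(sizes, weights):
    """Exhaustive search for the ordering with the lowest conflict cost."""
    return min(permutations(sorted(sizes)),
               key=lambda order: conflict_cost(order, sizes, weights))

# Hypothetical functions (sizes in bytes) and a hypothetical trace-derived weight:
# idct and vld alternate often, init runs once and interleaves with neither.
sizes = {"idct": 64, "init": 64, "vld": 64}
weights = {("idct", "vld"): 100}

naive = ("idct", "init", "vld")
print(conflict_cost(naive, sizes, weights))   # 400: idct and vld share 4 lines
print(best_order(sizes, weights))             # ('idct', 'vld', 'init')
```

In the naive order, `idct` and `vld` lie exactly one cache size (128 bytes) apart, so they map to the same four lines and evict each other whenever they alternate. Swapping `vld` and `init` moves the overlap onto a pair that does not interleave in the trace.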


Similar resources

Evaluation of architectural support for speech codecs application in large-scale parallel machines

Next-generation multimedia mobile phones that use the high-bandwidth 3G cellular radio network consume more power. Multimedia algorithms such as speech and video transcodecs have very large instruction footprints and consequently stall due to instruction cache misses. Conflicts in on-chip caches contribute a large fraction of the CPU cycle penalty and hence increase power consumption. Ma...


Temporal Distribution Based Software Cache Partition To Reduce I-cache Misses

As multimedia applications on mobile devices become more computationally demanding, embedded processors with one level I-cache become more prevalent, typically with a combined I-cache and SRAM of 32KB ~ 48KB total size. Code size reduction alone is no longer adequate for such applications since program sizes are much larger than the SRAM and I-cache combined. For such systems, a 3% I-cache miss...


Improving Memory-System Performance of Sparse Matrix-Vector Multiplication

Sparse matrix-vector multiplication is an important kernel that often runs inefficiently on superscalar RISC processors. This paper describes techniques that increase instruction-level parallelism and improve performance. The techniques include reordering to reduce cache misses originally due to Das et al., blocking to reduce load instructions, and prefetching to prevent multiple load-store uni...


Code Positioning for VLIW Architectures

Several studies have considered reducing instruction cache misses and branch penalty stall cycles by means of various forms of code placement. Most proposed approaches rearrange procedures or basic blocks in order to speed up execution on sequential architectures with branch prediction. Moreover, most works focus mainly on instruction cache performance and disregard execution cycles. To the bes...


Multimedia Processors - Proceedings of the IEEE

This paper describes recent large-scale-integration programmable processors designed for multimedia processing such as real-time compression and decompression of audio and video as well as the generation of computer graphics. As the target of these processors is to handle audio and video in real time, the processing capability must be increased tenfold compared to that of conventional microproc...



Publication date: 1997